Many-Task Computing and Blue Waters

Authors

  • Daniel S. Katz
  • Timothy G. Armstrong
  • Zhao Zhang
  • Michael Wilde
  • Justin M. Wozniak
Abstract

This report discusses many-task computing (MTC), both generically and in the context of the proposed Blue Waters systems. Blue Waters is planned to be the largest supercomputer funded by NSF when it begins production use in 2011–2012 at NCSA. The aim of this report is to inform the Blue Waters project about MTC, including understanding aspects of MTC applications that can be used to characterize the domain and understanding the implications of these aspects to middleware and policies on Blue Waters. Many MTC applications do not neatly fit the stereotypes of high-performance computing (HPC) or high-throughput computing (HTC) applications. Like HTC applications, by definition MTC applications are structured as graphs of discrete tasks, with explicit input and output dependencies forming the graph edges. However, MTC applications have significant features that distinguish them from typical HTC applications. In particular, different engineering constraints for hardware and software must be met in order to support these applications. HTC applications have traditionally run on platforms such as grids and clusters, through either workflow systems or parallel programming systems. MTC applications, in contrast, will often demand a short time to solution, may be communication intensive or data intensive, and may comprise very short tasks. Therefore, hardware and software for MTC must be engineered to support the additional communication and I/O and must minimize task dispatch overheads. The hardware of large-scale HPC systems such as Blue Waters, with its high degree of parallelism and support for intensive communication, is well suited for achieving low turnaround times with large, intensive MTC applications. However, HPC systems often lack a dynamic resource-provisioning feature, are not ideal for task communication via the file system, and have an I/O system that is not optimized for MTC-style applications. Hence, additional software support is likely to be required to gain full benefit from the HPC hardware.

Please cite as: D. S. Katz, T. G. Armstrong, Z. Zhang, M. Wilde, and J. M. Wozniak, Many Task Computing and Blue Waters. Technical Report CI-TR-13-0911. Computation Institute, University of Chicago & Argonne National Laboratory. http://www.ci.uchicago.edu/research/papers/CI-TR-13-0911

arXiv:1202.3943v1 [cs.DC] 17 Feb 2012
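The abstract describes MTC applications as graphs of discrete tasks whose edges are explicit input and output dependencies. As a rough illustration only, the following Python sketch runs such a graph on a single machine with a thread pool, dispatching each task as soon as its inputs are available. The function name `run_task_graph`, the `tasks` and `deps` dictionaries, and the example tasks are all hypothetical and are not taken from the report; a real MTC system on a machine like Blue Waters would also have to deal with the dispatch overheads and file system I/O that the abstract highlights.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_task_graph(tasks, deps, max_workers=4):
    """Run a task graph, starting each task once its inputs are ready.

    tasks: dict mapping a task name to a callable. The callable receives
           the results of its dependencies, in the order given in deps.
    deps:  dict mapping a task name to the list of task names it consumes.
    (Both structures are illustrative assumptions, not from the report.)
    """
    sorter = TopologicalSorter({name: deps.get(name, []) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError if the graph is cyclic
    results = {}
    running = {}  # maps each in-flight future to its task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Dispatch every task whose dependencies have all completed.
            for name in sorter.get_ready():
                args = [results[d] for d in deps.get(name, [])]
                running[pool.submit(tasks[name], *args)] = name
            # Block until at least one in-flight task finishes.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                results[name] = future.result()
                sorter.done(name)  # may unlock dependent tasks
    return results


# Hypothetical four-task graph: one producer, two consumers, one join.
EXAMPLE_TASKS = {
    "split": lambda: [3, 1, 2],
    "sort": lambda xs: sorted(xs),
    "total": lambda xs: sum(xs),
    "report": lambda ordered, total: (ordered, total),
}
EXAMPLE_DEPS = {
    "sort": ["split"],
    "total": ["split"],
    "report": ["sort", "total"],
}

if __name__ == "__main__":
    print(run_task_graph(EXAMPLE_TASKS, EXAMPLE_DEPS)["report"])
```

In this sketch the `sort` and `total` tasks can run concurrently once `split` finishes, while `report` waits for both. Results are passed in memory here, which sidesteps the task-to-task communication through the file system that the abstract identifies as a weak point of HPC systems.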


Similar resources

Hybrid Dataflow Programming on Blue Waters

This work presents the analysis of hybrid dataflow programming over XK7 nodes of Blue Waters using a novel CUDA framework GeMTC. GeMTC is an execution model and runtime system that enables accelerators to be programmed with many concurrent and independent tasks of potentially short or variable duration. With GeMTC, a broad class of such “many-task” applications can leverage the increasing numbe...


Optimization Task Scheduling Algorithm in Cloud Computing

Since software systems play a more important role in applications than ever, security has become one of the most important indicators of software. Cloud computing refers to services that run in a distributed network and are accessible through common internet protocols. Presenting a proper scheduling method can lead to efficiency of resources by decreasing response time and costs. This rese...


MP-sort: Sorting at Scale on Blue Waters – for a Cosmological Simulation

We implement and investigate a parallel sorting algorithm (MP-sort) on Blue Waters. MP-sort sorts distributed array items with non-unique integer keys into a new distributed array. The sorting algorithm belongs to the family of partition sorting algorithms: the target storage space of a parallel computing rank is represented by a histogram bin whose edges are determined by partitioning the inpu...


An Effective Task Scheduling Framework for Cloud Computing using NSGA-II

Cloud computing is a model for convenient, on-demand user access to changeable and configurable computing resources such as networks, servers, storage, applications, and services with minimal management of resources and service provider interaction. Task scheduling is regarded as a fundamental issue in cloud computing which aims at distributing the load on the different resources of a distribu...


Workload Analysis of Blue Waters

Blue Waters is a Petascale-level supercomputer whose mission is to enable the national scientific and research community to solve "grand challenge" problems that are orders of magnitude more complex than can be carried out on other high performance computing systems. Given the important and unique role that Blue Waters plays in the U.S. research portfolio, it is important to have a detailed under...



Journal:
  • CoRR

Volume: abs/1202.3943

Publication date: 2012